glm2: Fitting Generalized Linear Models with Convergence Problems
Abstract
The R function glm uses step-halving to deal with certain types of convergence problems when using iteratively reweighted least squares to fit a generalized linear model. This works well in some circumstances, but non-convergence remains a possibility, particularly with a nonstandard link function. In some cases this is because step-halving is never invoked, despite a lack of convergence. In other cases step-halving is invoked but is unable to induce convergence. One remedy is to impose a stricter form of step-halving than is currently available in glm, so that the deviance is forced to decrease in every iteration. This has been implemented in the glm2 function available in the glm2 package. Aside from a modified computational algorithm, glm2 operates in exactly the same way as glm and provides improved convergence properties. These improvements are illustrated here with an identity link Poisson model, but are also relevant in other contexts.
It is not uncommon for iteratively reweighted least squares (IRLS) to exhibit convergence problems when fitting a generalized linear model (GLM). Such problems are most common with a nonstandard link function, as in a log link binomial model or an identity link Poisson model. Consequently, most commonly used statistical software can invoke various modifications of IRLS if non-convergence occurs. In the stats package of R, IRLS is implemented in the glm function via its workhorse routine glm.fit. This routine deals with specific types of convergence problems by switching to step-halving if the iterates display certain undesirable properties. That is, if a full Fisher scoring step of IRLS would lead to either an infinite deviance or predicted values that are invalid for the model being fitted, then the increment in the parameter estimates is repeatedly halved until the updated estimates no longer exhibit these features.
This is achieved through repeated application of the call start <- (start + coefold)/2, where coefold and start contain estimates from the previous and current iterations, respectively. Although this approach works well in some contexts, it is prone to fail in others. In particular, although the step-halving process in glm.fit will throw an errant iterative sequence back into the desired region, the sequence may repeatedly try to escape that region and never converge. Furthermore, the IRLS iterative sequence can be such that step-halving is never invoked in glm.fit, yet the sequence does not converge. Such behavior is typically accompanied by a deviance sequence that increases in one or more of the iterations. This suggests a modification to glm.fit, which has been implemented in the glm2 package (Marschner, 2011).
As motivation for the proposed modification, we begin by discussing the potential for non-convergence using some numerical examples of the above types of behavior. The glm2 package is then discussed, which consists of a main function glm2 and a workhorse routine glm.fit2. These are modified versions of glm and glm.fit, in which step-halving is used to force the deviance to decrease from one iteration to the next. It is shown that this modification provides improved convergence behavior.
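The stricter rule described above can be sketched for a one-covariate identity link Poisson model. This is an illustrative Python sketch, not the source of glm.fit or glm.fit2: the function names (fitted, poisson_deviance, irls_step, fit_strict_halving), the small data set, the starting value, and the tolerance settings are all assumptions made for the example. Each full IRLS step is averaged with the previous estimate, mirroring start <- (start + coefold)/2, until the deviance no longer exceeds its previous value. Because the deviance is treated as infinite whenever a fitted mean is not positive, the same loop also handles invalid predicted values.

```python
import math


def fitted(beta, x):
    """Identity link: the fitted mean equals the linear predictor."""
    return [beta[0] + beta[1] * xi for xi in x]


def poisson_deviance(beta, x, y):
    """Poisson deviance; infinite if any fitted mean is not positive."""
    mu = fitted(beta, x)
    if any(m <= 0 for m in mu):
        return math.inf
    dev = 0.0
    for yi, m in zip(y, mu):
        term = yi * math.log(yi / m) if yi > 0 else 0.0
        dev += 2.0 * (term - (yi - m))
    return dev


def irls_step(beta, x, y):
    """One full IRLS (Fisher scoring) step.

    With an identity link the working response is y and the working
    weights are 1 / mu, so the step is a weighted least-squares fit of
    y on (1, x), solved here in closed form.
    """
    w = [1.0 / m for m in fitted(beta, x)]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx ** 2
    b1 = (sw * swxy - swx * swy) / det
    b0 = (swy - b1 * swx) / sw
    return (b0, b1)


def fit_strict_halving(x, y, start, tol=1e-8, max_iter=25, max_halvings=30):
    """IRLS in which step-halving forces a non-increasing deviance."""
    beta = tuple(start)
    dev = poisson_deviance(beta, x, y)
    if math.isinf(dev):
        raise ValueError("start must give positive fitted means")
    trace = [dev]
    for _ in range(max_iter):
        new = irls_step(beta, x, y)
        new_dev = poisson_deviance(new, x, y)
        halvings = 0
        # Halve the increment while the deviance is larger than before
        # (this includes an infinite deviance from invalid fitted means).
        while new_dev > dev and halvings < max_halvings:
            new = tuple((n + b) / 2.0 for n, b in zip(new, beta))
            new_dev = poisson_deviance(new, x, y)
            halvings += 1
        if new_dev > dev:
            break  # no acceptable step found; keep the last estimate
        converged = abs(new_dev - dev) / (abs(new_dev) + 0.1) < tol
        beta, dev = new, new_dev
        trace.append(dev)
        if converged:
            return beta, trace, True
    return beta, trace, False


# Hypothetical data, chosen only to exercise the code.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1, 0, 3, 2, 6, 5]
beta, trace, converged = fit_strict_halving(x, y, start=(sum(y) / len(y), 0.0))
print(beta, converged)
print(trace)
```

The deviance trace returned by this sketch cannot increase, since a step is accepted only when its deviance does not exceed the previous one. In R itself, the text indicates that glm2 is called in the same way as glm, so no such hand-written loop is needed there.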